<HTML>
<!-- (c) Copyright 2003, 2004, 2005, 2006, 2007, 2008, 2009 Hewlett-Packard Development Company, LP -->


<HEAD>
<TITLE>Jena Relational Database backend</TITLE>
<link href="../styles/doc.css" rel="stylesheet" type="text/css" />
</HEAD>



<BODY bgcolor="#FFFFFF">



<h1>Jena2 Database Interface - Porting Notes</h1>



<p>Jena2 was designed to be portable to other SQL database engines. The Jena2 
database dependencies are isolated in two places: the driver and the SQL 
scripts. These notes provide a brief overview of what is required to port Jena2 
to a new SQL database. However, the persistence architecture is fairly general 
and in the future we want to investigate support for other persistence engines 
such as Berkeley DB and a native Java store.</p>
<p>This document is known to be incomplete but is provided in the spirit of 
something is better than nothing. Porting to Jena2 is possible. DB2 was ported 
to a preliminary release of Jena2 without too much effort. If you are working on 
a port, please do not hesitate to ask for help by posting to the jena-devel 
list.</p>
<p>&nbsp;</p>
<h2>Porting Parameters</h2>
<p>Below is a list of parameters you need to know about your database engine 
before starting a port to Jena2.</p>
<font SIZE="2">
<dl>
  <dt>Object Column Type and Length</dt>
</font>
<font SIZE="1">
  <dd>This is the database column type that will be used to store character data 
  in statement tables. It must be support case-sensitive matching. Typically, 
  this is a Varchar column. But, in some some systems, Varchar is 
  case-insensitive (e.g., MySQL) or there are variants of Varchar (e.g., 
  Oracle). The maximum size of this column type should be determined in order to 
  set the LONG_OBJECT_LENGTH_MAX value. LONG_OBJECT_LENGTH should be set to be 
  equal or less than this maximum value depending on the degree of 
  denormalization desired.</dd>
</font>
<font SIZE="2">
  <dt>&nbsp;</dt>
  <dt>Index Column Type and Length</dt>
</font>
<font SIZE="1">
  <dd>This is the database column type that will be used to store character data 
  in the long object tables. Objects (literals and resources) whose length 
  exceeds LONG_OBJECT_LENGTH are not stored in the statement table but are 
  stored in long object tables (jena_long_lit, jena_long_uri). Each long value 
  is decomposed into a head and a tail. The head is that part of the long object 
  that can be indexed. The tail is the remaining (possibly large) portion that 
  is stored as a blob. Also computed is a content-based hash over the tail. The 
  head and hash enable exact matching.</dd>
</font>
<font SIZE="2">
  <dd>&nbsp;</dd>
</font>
<font SIZE="1">
  <dd>This index column type is then the column type for the head of the long 
  object. Typically, it is the same type as the object column type. The maximum 
  length of this column type determines INDEX_KEY_LENGTH_MAX. INDEX_KEY_LENGTH 
  should be set to be less than or equal this maximum value.</dd>
</font>
<font SIZE="2">
  <dt>&nbsp;</dt>
  <dt>Maximum Table and Index Name Length</dt>
</font>
<font SIZE="1">
  <dd>The maximum number of characters permitted in a table or index name is 
  needed for defining TABLE_NAME_LENGTH_MAX.</dd>
</font>
<font SIZE="2">
  <dt>&nbsp;</dt>
  <dt>Table and Index Name Case Sensitivity</dt>
</font>
<font SIZE="1">
  <dd>Some database engines require table and index names to be in upper case. 
  Set DB_NAMES_TO_UPPER appropriately for your database.</dd>
</font>
<font SIZE="2">
  <dt>&nbsp;</dt>
  <dt>Sequences</dt>
</font>
<font SIZE="1">
  <dd>Database engines have different ways of supporting automatic generation of 
  unique identifiers. Some systems use sequences, other systems have special 
  column types. It is assumed that the engine has some such mechanism. Jena will 
  use it to generate identifiers for long objects, graphs, etc. In some cases, 
  identifiers are allocated prior to inserting a row and in other cases the 
  identifier is allocated during or after row insertion. PRE_ALLOCATE_ID will 
  need to be set appropriately.</dd>
</font>
<font SIZE="2">
  <dt>&nbsp;</dt>
  <dt>Transactions</dt>
</font>
<font SIZE="1">
  <dd>It is assumed that the database engine supports transactions (bracketed by 
  begin and end operations). But, some engines only support atomic operations or 
  have variants that only support atomic operations (e.g., MySQL). Set 
  IS_XACT_DB appropriately.</dd>
</font>
<font SIZE="2">
  <dt>&nbsp;</dt>
  <dt>Quoting Character</dt>
</font>
<font SIZE="1">
  <dd>Database engines differ in their choice of the quote character (e.g., 
  single, double) to delimit character strings. Set QUOTE_CHAR to the quote 
  character for your engine.</dd>
</dl>
</font>
<font SIZE="2">
<dl>
  <dt>Strings Trimmed</dt>
</font>
<font SIZE="1">
  <dd>Some database engines truncate trailing white space at the end of 
  character strings. If the column type for your object or&nbsp; index columns 
  types does this, set STRINGS_TRIMMED to true and define a string termination 
  character (EOS) to prevent truncation of white space.</dd>
</dl>
<p>The driver code requires a few other miscellaneous parameters, e.g., database 
type, driver name, SQL file name. These should all be self-explanatory.</p>
</font>
<font SIZE="2">
<p>&nbsp;</p>
</font>
<h2>SQL Scripts</h2>
<p>In the <i>etc</i> subdirectory of the Jena2 distribution there is a file of 
SQL scripts, one for each supported database engine. These are SQL commands for 
fixed, well known database operations, e.g., to create a statement table, to 
retrieve a long object. To add a new engine, simply copy and edit one of the 
existing script files. The edits should be consistent with the parameter values. 
You may need to define more or fewer parameters for certain operations such as 
table creation. It may help to look at a few different script files as examples.</p>
<p>&nbsp;</p>
<h2>Driver Code</h2>
<p>Every database engine requires a Jena driver. Jena2 provides a generic driver 
(jena/db/DriverRDB.java) that provides default implementations for most of the 
driver methods. In the simplest case, the engine-specific driver only needs to 
override a few methods. As an example, the PostgreSQL driver is probably the 
simplest and the MySQL driver is almost as simple. The Oracle driver is more 
complicated because of the way Oracle handles large objects. The treatment of 
large objects is probably the most difficult part of porting.</p>
<p>Another customization that may be required for a new database engine is the 
dynamic generation of SQL code. For Fastpath queries and some types of 
reification queries, Jena2 generates SQL dynamically. If problems are 
encountered with these functions during development or testing, the generation 
code may need to be overridden in the engine-specific driver. The generation 
code is in DriverRDB.java and has method names of the form genSQLxxx.</p>
<p>&nbsp;</p>
<h2>Testing</h2>
<p>Once the driver and SQL file have been created, the next step is to check 
that the ModelRDB tests pass. There are a number of JUnit tests in jena/db/test. 
These may be run from the script <code>test-db.sh</code>, after modification for 
driver details and database name.</p>
<p>&nbsp;</p>



</BODY>



</HTML>